Predictiveness curves in virtual screening
نویسندگان
چکیده
BACKGROUND In the present work, we aim to transfer to the field of virtual screening the predictiveness curve, a metric that has been advocated in clinical epidemiology. The literature describes the use of predictiveness curves to evaluate the performances of biological markers to formulate diagnoses, prognoses and assess disease risks, assess the fit of risk models, and estimate the clinical utility of a model when applied to a population. Similarly, we use logistic regression models to calculate activity probabilities related to the scores that the compounds obtained in virtual screening experiments. The predictiveness curve can provide an intuitive and graphical tool to compare the predictive power of virtual screening methods. RESULTS Similarly to ROC curves, predictiveness curves are functions of the distribution of the scores and provide a common scale for the evaluation of virtual screening methods. Contrarily to ROC curves, the dispersion of the scores is well described by predictiveness curves. This property allows the quantification of the predictive performance of virtual screening methods on a fraction of a given molecular dataset and makes the predictiveness curve an efficient tool to address the early recognition problem. To this last end, we introduce the use of the total gain and partial total gain to quantify recognition and early recognition of active compounds attributed to the variations of the scores obtained with virtual screening methods. Additionally to its usefulness in the evaluation of virtual screening methods, predictiveness curves can be used to define optimal score thresholds for the selection of compounds to be tested experimentally in a drug discovery program. We illustrate the use of predictiveness curves as a complement to ROC on the results of a virtual screening of the Directory of Useful Decoys datasets using three different methods (Surflex-dock, ICM, Autodock Vina). CONCLUSION The predictiveness curves cover different aspects of the predictive power of the scores, allowing a detailed evaluation of the performance of virtual screening methods. We believe predictiveness curves efficiently complete the set of tools available for the analysis of virtual screening results.
منابع مشابه
Screening Explorer-An Interactive Tool for the Analysis of Screening Results
Screening Explorer is a web-based application that allows for an intuitive evaluation of the results of screening experiments using complementary metrics in the field. The usual evaluation of screening results implies the separate generation and apprehension of the ROC, predictiveness, and enrichment curves and their global metrics. Similarly, partial metrics need to be calculated repeatedly fo...
متن کاملSemiparametric methods for evaluating the covariate-specific predictiveness of continuous markers in matched case-control studies.
To assess the value of a continuous marker in predicting the risk of a disease, a graphical tool called the predictiveness curve has been proposed. It characterizes the marker's predictiveness, or capacity to risk stratify the population by displaying the distribution of risk endowed by the marker. Methods for making inference about the curve and for comparing curves in a general population hav...
متن کاملEvaluating the predictiveness of a continuous marker.
Consider a continuous marker for predicting a binary outcome. For example, the serum concentration of prostate specific antigen may be used to calculate the risk of finding prostate cancer in a biopsy. In this article, we argue that the predictive capacity of a marker has to do with the population distribution of risk given the marker and suggest a graphical tool, the predictiveness curve, that...
متن کاملA parametric ROC model-based approach for evaluating the predictiveness of continuous markers in case-control studies.
The predictiveness curve shows the population distribution of risk endowed by a marker or risk prediction model. It provides a means for assessing the model's capacity for stratifying the population according to risk. Methods for making inference about the predictiveness curve have been developed using cross-sectional or cohort data. Here we consider inference based on case-control studies, whi...
متن کاملThe residual-based predictiveness curve: A visual tool to assess the performance of prediction models.
It is agreed among biostatisticians that prediction models for binary outcomes should satisfy two essential criteria: first, a prediction model should have a high discriminatory power, implying that it is able to clearly separate cases from controls. Second, the model should be well calibrated, meaning that the predicted risks should closely agree with the relative frequencies observed in the d...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره 7 شماره
صفحات -
تاریخ انتشار 2015